Genetic risk score for insulin resistance based on gene variants associated to amino acid metabolism in young adults

Circulating concentrations of arginine, alanine, aspartate, isoleucine, leucine, phenylalanine, proline, tyrosine, taurine and valine are increased in subjects with insulin resistance, which could in part be attributed to the presence of single nucleotide polymorphisms (SNPs) within genes associated with amino acid metabolism. Thus, the aim of this work was to develop a Genetic Risk Score (GRS) for insulin resistance in young adults based on SNPs present in genes related to amino acid metabolism. We performed a cross-sectional study that included 452 subjects over 18 years of age. Anthropometric, clinical, and biochemical parameters were assessed, including measurement of serum amino acids by high performance liquid chromatography. Eighteen SNPs were genotyped by allelic discrimination. Of these, ten were found to be in Hardy-Weinberg equilibrium, and only four were used to construct the GRS through multiple linear regression modeling. The GRS was calculated using the number of risk alleles of the SNPs in the HGD, PRODH, DLD and SLC7A9 genes. Subjects with high GRS (≥ 0.836) had higher levels of glucose, insulin, homeostatic model assessment-insulin resistance (HOMA-IR), total cholesterol and triglycerides, and lower levels of arginine than subjects with low GRS (p < 0.05). The application of a GRS based on variants within genes associated with amino acid metabolism may be useful for the early identification of subjects at increased risk of insulin resistance.


Introduction
The presence of different cardiometabolic risk factors and components of the metabolic syndrome, such as obesity, dyslipidemia, hypertension, hyperglycemia and insulin resistance (IR), increases the risk of cardiovascular disease and type 2 diabetes (T2D) [1,2]. Notably, changes in the circulating amino acid profile are related to IR [3]. In fact, subjects with IR have higher serum concentrations of branched chain amino acids (BCAA), aromatic amino acids (AAA), glutamine and glutamate, and lower levels of glycine, than subjects without IR [4]. Even young adults with IR have higher plasma concentrations of arginine, alanine, aspartate, isoleucine, leucine, phenylalanine, proline, tyrosine, taurine and valine [5]. Moreover, different epidemiological studies have reported that BCAA (isoleucine, leucine and valine), AAA (phenylalanine and tyrosine) and glutamine could predict the development of T2D [6–8].
Amino acid levels change with age and with the presence of SNPs involved in BCAA metabolism in subjects with obesity and metabolic syndrome (MetS) [9,10]. Interestingly, subjects carrying both the BCAT2 (Branched Chain Amino Acid Transaminase 2) rs11548193 and the BCKDH (Branched Chain Keto Acid Dehydrogenase) rs45500792 SNPs have higher circulating levels of aspartate, isoleucine, methionine, and proline than subjects homozygous for the most common allele [11]. This evidence suggests that the sum of various risk alleles may provide a better estimation of plasma amino acid levels and IR. Indeed, genome-wide association studies (GWAS) have identified multiple SNPs that influence serum concentrations of circulating metabolites, including amino acids such as BCAA, AAA, histidine and glutamine [12]. This has led us to ask whether the presence of different SNPs related to amino acid metabolism could alter their plasma concentrations and, moreover, help us to identify subjects with a higher risk of developing IR. This could be achieved through the development of a GRS, which represents the cumulative contribution of risk alleles from various SNPs to a specific outcome of interest within an individual. Combining several variants into a GRS can capture an individual's susceptibility to a disease [13–15]. Therefore, the aim of this study was to develop a GRS to predict the risk of IR in young Mexican adults based on the determination of selected SNPs in genes related to the metabolism of amino acids.

Study population
We carried out a cross-sectional study. Subjects from the general population who were carrying out university admission procedures were invited to participate in the study. Subjects were recruited from June 16, 2014 to July 3, 2014, at the Universidad Autónoma de San Luis Potosí (UASLP) in San Luis Potosí, México. Subjects received detailed information about the study, and those who wished to participate gave their written informed consent. The identity information of all participants was coded to ensure that privacy was not compromised. The study was designed in accordance with the Declaration of Helsinki and the ethical treatment of human subjects, and was approved by the Ethics Committee of the National Institute of Medical Sciences and Nutrition Salvador Zubirán (Registration number 669). The participants were Mexicans from 18 to 25 years old with body mass index (BMI) ≥ 18.5 and < 40 kg/m². The exclusion criteria included pregnancy, substance abuse, history of cardiovascular events, chronic diseases (including individuals previously and newly diagnosed with T2D), and treatment with hypoglycemic agents, antihypertensive agents, agents used to treat dyslipidemias, steroids, or immunosuppressors. The elimination criterion was the voluntary withdrawal of the participants. Subjects were evaluated by medical examination to collect anthropometric and clinical measurements, and underwent blood sample collection for biochemical analysis, DNA extraction and SNP determination. Among the biochemical variables, amino acids were determined (alanine, arginine, aspartic and glutamic acids, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tyrosine and valine).

Anthropometric measurements
Anthropometric evaluation was performed after a 12-h fast. Height was measured with a mobile SECA® stadiometer (Seca 213, USA), and weight was obtained twice using a TANITA® calibrated electronic device (Tanita UM-081, Kyoto, Japan). Body mass index (BMI) was calculated using Quetelet's formula:
$$\mathrm{BMI} = \frac{\text{weight (kg)}}{\text{height (m)}^{2}}$$
The subjects were classified according to their BMI based on the World Health Organization criteria [16].
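As an illustration of the BMI calculation and classification described above, the following is a minimal sketch rather than the authors' procedure. The function names are hypothetical, and the cut-offs (18.5, 25 and 30 kg/m²) are the commonly cited WHO adult classes, assumed here because the text only refers to reference [16].

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Quetelet's index: weight in kilograms divided by squared height in metres."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> str:
    """Assign a WHO adult BMI class (assumed cut-offs: 18.5, 25 and 30 kg/m^2)."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal weight"
    if bmi_value < 30.0:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    # Hypothetical subject, not study data.
    value = bmi(weight_kg=70.0, height_m=1.75)
    print(round(value, 1), who_category(value))
```

Under the inclusion criteria stated for this study (BMI ≥ 18.5 and < 40 kg/m²), the "underweight" class would not occur among participants.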

Clinical measurements
Systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured in the dominant right arm, in a sitting position, using an OMRON® digital sphygmomanometer (HEM-7130, Kyoto, Japan) and appropriately sized cuffs according to clinical standards [17]. Blood pressure was considered altered at a cut-off point of ≥ 130/85 mmHg [18].

Biochemical analyses
Blood samples were collected from the subjects after a 12-hour fast, and serum was subsequently extracted. Glucose, total cholesterol, high density lipoprotein cholesterol (HDL-C), low density lipoprotein cholesterol (LDL-C) and triglycerides were measured enzymatically using an Ortho Clinical Vitros 250 Chemistry System (© Ortho-Clinical Diagnostics, Inc., Raritan, NJ). Insulin and leptin were measured by radioimmunoassay (RIA, Millipore, Billerica, MA, USA). IR was obtained through the HOMA-IR [19].
IR was established with a HOMA-IR value ≥ 2.5 [20,21]. Amino acids were measured by high performance liquid chromatography (HPLC). Briefly, 50 μL of 10% sulfosalicylic acid was added to 200 μL of serum, incubated for 30 min at 4 °C, and centrifuged at 14,000 rpm for 20 min. The supernatant was obtained and one microliter of internal standard (25 mM norvaline) was added prior to derivatization and injection. For derivatization, o-phthalaldehyde (OPA) and 9-fluorenylmethyl chloroformate (FMOC) were used. Derivatization and injection were carried out using a sampling device (Agilent; G1367F) coupled to an Agilent 1260 Infinity HPLC with a fluorescence detector (Agilent; G1321B). A ZORBAX Eclipse AAA column was used at 40 °C and the chromatographic conditions indicated by the manufacturer (Agilent; 5980-1193) were applied [22].
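The HOMA-IR index and the IR cut-off mentioned above can be sketched as follows. The text cites reference [19] without restating the equation, so this example assumes the widely used form of the index, fasting glucose (mg/dL) × fasting insulin (µU/mL) / 405, which is equivalent to glucose in mmol/L × insulin / 22.5. The function names are hypothetical; only the ≥ 2.5 cut-off comes from the text.

```python
IR_CUTOFF = 2.5  # HOMA-IR value used in the text to define insulin resistance


def homa_ir(glucose_mg_dl: float, insulin_uU_ml: float) -> float:
    """HOMA-IR from fasting glucose (mg/dL) and fasting insulin (uU/mL).

    Assumed formula (not restated in the text): glucose * insulin / 405.
    """
    if glucose_mg_dl <= 0 or insulin_uU_ml <= 0:
        raise ValueError("glucose and insulin must be positive")
    return glucose_mg_dl * insulin_uU_ml / 405.0


def has_insulin_resistance(glucose_mg_dl: float, insulin_uU_ml: float) -> bool:
    """True when HOMA-IR reaches the cut-off of 2.5 or higher."""
    return homa_ir(glucose_mg_dl, insulin_uU_ml) >= IR_CUTOFF


if __name__ == "__main__":
    # Hypothetical fasting values, not study data.
    print(round(homa_ir(90.0, 10.0), 2), has_insulin_resistance(90.0, 10.0))
```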

SNP selection
We performed a bibliographic search to identify SNPs present in genes related to amino acid metabolism (such as catabolic enzymes or amino acid transporters) that were previously associated with alterations in plasma amino acid concentrations and/or with cardiometabolic risk factors. SNPs were selected when a frequency > 10% was reported for the Mexican or Latino population in databases such as GeneCards (https://www.genecards.org/), GWAS Catalog (https://www.ebi.ac.uk/gwas/), DisGeNET (https://www.disgenet.org/) and NCBI dbSNP (https://www.ncbi.nlm.nih.gov/snp/). Additionally, non-synonymous SNPs were preferentially selected. Finally, 18 SNPs that met the selection criteria were chosen (Table 1 in S1 Appendix).
These 18 SNPs were analyzed by allelic discrimination assays using TaqMan probes (Applied Biosystems®) by real time polymerase chain reaction (RT-PCR) on a LightCycler® 480 instrument (Roche®). Briefly, a master mixture was prepared considering, for each sample, 0.75 μL of TaqMan probe, 0.25 μL of molecular grade nuclease-free ultrapure water (USB®, USA), and 5 μL of Probes Master (LightCycler® 480), following the manufacturer's instructions. Then, to perform PCR, 4 μL of previously adjusted DNA and 6 μL of the master mixture were added to each well of the 96-well plates (Roche®). Negative controls, which carried only the master mixture and nuclease-free water, were also included. The reactions were performed in duplicate. The cycling conditions consisted of an initial pre-incubation cycle at 95 °C for 10 min, followed by 45 cycles of denaturation at 95 °C for 12 s, annealing at 60 °C for 50 s and extension at 72 °C for 2 s, and a cooling cycle at 40 °C for 30 s. For allelic discrimination results, the context sequence for each TaqMan probe and the fluorophores targeting each allele were previously verified based on information reported by the manufacturer.
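The per-reaction volumes above add up to the 6 μL of master mixture dispensed per well (0.75 + 0.25 + 5 μL). A small sketch of how those volumes scale to a plate is shown below; the function name and the `overage` parameter (extra volume to cover pipetting loss) are illustrative assumptions and are not part of the described protocol.

```python
# Per-reaction volumes (uL) taken from the protocol described in the text.
PER_REACTION_UL = {
    "TaqMan probe": 0.75,
    "nuclease-free water": 0.25,
    "Probes Master": 5.0,
}


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale per-reaction volumes to n_reactions wells.

    `overage` is a hypothetical safety margin (10% by default), not a value
    reported in the text.
    """
    if n_reactions < 1 or overage < 0:
        raise ValueError("n_reactions must be >= 1 and overage >= 0")
    scale = n_reactions * (1.0 + overage)
    return {reagent: round(vol * scale, 2) for reagent, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    # Example: 46 samples in duplicate plus 4 negative-control wells = 96 wells.
    wells = 46 * 2 + 4
    print(wells, master_mix(wells))
```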

GRS calculation
We constructed a multilocus GRS using the SNPs (n = 10) that were in Hardy-Weinberg equilibrium (p > 0.05) (Table 2 in S1 Appendix). The GRS was calculated for each individual as the sum of the number of IR risk alleles based on the highest HOMA-IR value (reference cut-off ≥ 2.5). Thus, we developed a simple GRS in apparently healthy subjects, using the allele of each SNP that, according to our hypothesis and previous reports, would influence the risk of IR [9,24–26] (Table 3 in S1 Appendix). Briefly, for each of the SNPs, we assigned a value of 0, 1, or 2, with the highest value representing homozygotes for the high-risk allele for IR, and the lowest value representing homozygotes for the low-risk allele. A value of 1 denoted heterozygotes. Multiple linear regression was performed to develop a GRS for IR risk, using the HOMA-IR value as the dependent variable and each SNP as an independent variable. This process resulted in distinct models. The model containing the SNPs that exhibited a significant association was selected. Among these SNPs, three were found to be significantly associated with IR (p < 0.05), while one exhibited marginal significance (p = 0.057).
Thus, only four SNPs served as predictors of IR risk and were used to calculate a weighted GRS for each individual. This involved multiplying the standardized β coefficient by the effect size (0, 1 or 2) for each SNP, followed by summing the scores obtained from the four SNPs for each subject:
$$\mathrm{GRS} = \sum_{i=1}^{k} \beta_i N_i$$
where $k$ is the number of independent genetic variants associated with IR, $N_i$ corresponds to the effect size (0, 1 or 2) of the $i$-th SNP, that is, the number of risk alleles carried by the individual, and $\beta_i$ is the coefficient estimated for that SNP in its association with HOMA-IR.
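The weighted score described above (each SNP's β coefficient multiplied by the individual's risk-allele count, then summed) can be sketched as follows. The rs identifiers are those reported for the final model in this article, but the β values in `EXAMPLE_BETAS` are placeholders chosen only for illustration; they are not the coefficients estimated in the study, and the function name is hypothetical.

```python
from typing import Mapping

# Placeholder coefficients for illustration only (NOT the study's estimates).
EXAMPLE_BETAS = {
    "rs2255543": 0.10,   # HGD
    "rs5747933": 0.12,   # PRODH
    "rs6943999": 0.09,   # DLD
    "rs1007160": 0.11,   # SLC7A9
}


def weighted_grs(risk_allele_counts: Mapping[str, int],
                 betas: Mapping[str, float]) -> float:
    """Sum of beta_i * N_i over the SNPs in `betas`.

    N_i is the number of risk alleles (0, 1 or 2) carried at SNP i.
    """
    score = 0.0
    for snp, beta in betas.items():
        n_risk = risk_allele_counts[snp]
        if n_risk not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2")
        score += beta * n_risk
    return score


if __name__ == "__main__":
    # Hypothetical genotype: heterozygous at two SNPs, risk-homozygous at one.
    subject = {"rs2255543": 1, "rs5747933": 2, "rs6943999": 0, "rs1007160": 1}
    print(round(weighted_grs(subject, EXAMPLE_BETAS), 3))
```

Because each SNP contributes only 0, 1 or 2 times its coefficient, the score takes a limited set of discrete values, which is why groups defined by score cut-offs need not be of equal size.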

Statistical analyses
Continuous variables were presented as median and interquartile range (25th–75th percentiles) or as mean and standard deviation. The distribution of these variables was evaluated using the Kolmogorov-Smirnov Z test. Dichotomous or nominal variables were expressed as frequencies and percentages. To analyze differences in anthropometric, clinical, and biochemical data, Student's t-test was used for variables with a parametric distribution, while the Mann-Whitney U test was used for non-parametric variables. Genotype frequencies were analyzed using chi-square analysis to assess Hardy-Weinberg equilibrium (p > 0.05).
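The chi-square assessment of Hardy-Weinberg equilibrium mentioned above can be illustrated with a short sketch. It assumes the standard goodness-of-fit test for a biallelic SNP with one degree of freedom; the function name and the genotype counts in the example are hypothetical, and the study's own computation was carried out in statistical software.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple:
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts (AA, AB, BB) and returns (chi2, p_value),
    with the p-value taken from a chi-square distribution with 1 degree of
    freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                           # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts, not study data.
    chi2, p_value = hwe_chi_square(200, 190, 62)
    print(round(chi2, 3), round(p_value, 3), "in HWE" if p_value > 0.05 else "deviates")
```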
For the GRS, the effect of each SNP on the HOMA-IR variable was first assessed using a general linear model adjusted for age, sex and BMI. Then, multiple linear regression analysis was used to assess the association between HOMA-IR (dependent variable) and the 10 SNPs (independent variables). Non-collinearity between the independent variables was evaluated beforehand. The backward-stepwise method was used to select the final model. Significant SNPs were used for the GRS. Moreover, we evaluated the association between the obtained GRS and the HOMA-IR variable, adjusting for age, sex, and BMI, using a generalized linear model. Subsequently, the GRS was categorized into tertiles. This categorization was used to assess the trends in each anthropometric, clinical, and biochemical variable among the subjects using the Jonckheere-Terpstra test.
Lastly, ANOVA and the Bonferroni post-hoc test, with and without adjustment for covariates (age, BMI and sex), were used to assess differences in the variables of interest across GRS categories. Non-normally distributed data were logarithmically transformed beforehand. Differences were considered significant at p < 0.05. Data were analyzed using SPSS software version 20.0 (SPSS Inc., USA).
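The tertile categorization of the GRS described above can be sketched as follows. This is an assumption-laden illustration: it uses Python's `statistics.quantiles` with the "inclusive" method to obtain the two cut points, whereas the study used SPSS, whose percentile definition may differ slightly. The function name is hypothetical; the labels follow the GRS-low, GRS-medium and GRS-high naming used in this article.

```python
import statistics
from typing import List, Sequence


def tertile_labels(scores: Sequence[float]) -> List[str]:
    """Label each score as GRS-low, GRS-medium or GRS-high.

    Cut points are the 1/3 and 2/3 quantiles of the observed scores; values
    equal to a cut point fall into the lower group.
    """
    cut1, cut2 = statistics.quantiles(scores, n=3, method="inclusive")
    labels = []
    for score in scores:
        if score <= cut1:
            labels.append("GRS-low")
        elif score <= cut2:
            labels.append("GRS-medium")
        else:
            labels.append("GRS-high")
    return labels


if __name__ == "__main__":
    # Hypothetical scores with ties, not study data.
    example = [0.55, 0.62, 0.62, 0.70, 0.74, 0.74, 0.74, 0.84, 0.90]
    labels = tertile_labels(example)
    print({name: labels.count(name) for name in ("GRS-low", "GRS-medium", "GRS-high")})
```

When many subjects share the same score, as happens with a score built from a few SNPs, the three groups can differ in size.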

Characteristics of subjects
We analyzed 452 subjects, 46.7% women and 53.3% men, with a median age of 19 (18–20) years. Based on the clinical and biochemical evaluation, SBP, DBP, serum glucose levels, total cholesterol, HDL-C, LDL-C, triglycerides, insulin, leptin and HOMA-IR were within reference limits. However, considering the 75th percentile, 25% of the subjects exhibited HOMA-IR levels > 3.08; in addition, 30.5% were classified as overweight and 10.6% as obese according to their BMI (Table 1).
When classified by sex, weight, SBP, DBP, glucose and triglyceride levels were higher in men, while HDL-C, insulin and leptin levels were higher in women (Table 1). Regarding serum amino acid levels, aspartate, serine and arginine were significantly higher in women, while histidine, methionine, isoleucine, leucine, valine and the sum of BCAAs were higher in men (Table 1). When categorizing the subjects based on IR, we observed that 44.5% of the subjects presented IR. As expected, individuals with IR showed higher weight, BMI, SBP, DBP, glucose, triglycerides, insulin and leptin levels, while their HDL-C levels were lower compared to those without IR (Table 2).
Moreover, we observed that subjects with IR had higher levels of aspartate, glutamate, arginine, alanine, tyrosine, methionine, phenylalanine, lysine, isoleucine, leucine, valine and the sum of BCAAs, and lower glycine levels, than subjects without IR (Table 2).

Genotype frequencies
We determined the genotypic frequencies of the 18 SNPs among the subjects. For every SNP, homozygotes for the common allele had a frequency higher than 35%; in particular, for the SNPs present in TAT, OTC, HAL, BCAT2 and BCKDH this frequency was higher than 80%.

GRS for HOMA-IR
Among all the models analyzed in the multiple linear regression analysis (Table 5 in S1 Appendix), the model with the highest number of SNPs significantly associated with HOMA-IR included the following: rs2255543 (HGD), rs5747933 (PRODH), rs6943999 (DLD) and rs1007160 (SLC7A9) (Table 3). Subsequently, the GRS was calculated based on the standardized β coefficient and the effect size for each SNP. The GRS explained 24.6% of the HOMA-IR variability adjusted by BMI, sex and age (R² = 0.246, p < 0.01).

Characteristics of subjects based on GRS
The GRS was categorized into tertiles (T1 = 149 subjects; T2 = 211 subjects; T3 = 92 subjects). The 92 subjects carrying the risk alleles who were classified in the highest tertile (GRS-high, cut-off point ≥ 0.836) had significantly higher HOMA-IR values than subjects in the first (GRS-low) and second (GRS-medium) tertiles (Fig 1). Interestingly, without covariate adjustment, subjects with a high GRS showed higher levels of glucose, total cholesterol, triglycerides and insulin (p < 0.05) than subjects with a low GRS (cut-off point ≤ 0.624). These results, except for total cholesterol, were maintained after adjustment for age, sex and BMI (Table 4). Furthermore, weight, BMI, glucose, total cholesterol, triglycerides, leptin, insulin and HOMA-IR showed a significant positive trend, with higher levels in subjects with a high GRS than in subjects with medium and low GRS (p < 0.05) (Table 6 in S1 Appendix).
Finally, subjects with a low GRS had slightly higher arginine levels than subjects with a high GRS (p < 0.05) (Table 5). Some amino acids, such as proline, exhibited a negative trend in their concentrations among subjects with a high GRS compared to those with a low GRS (p < 0.05) (Table 7 in S1 Appendix). Moreover, glycine exhibited a downward trend while alanine and BCAA showed an upward trend, although these trends were not statistically significant (Table 7 in S1 Appendix). When classified by sex, we observed that women with a low GRS had higher levels of aspartate, serine, and arginine, and lower levels of methionine, isoleucine, and leucine than men. However, women with a high GRS no longer exhibited differences in serine and methionine levels. Notably, women with a medium GRS additionally had lower levels of histidine and valine than men (Table 8 in S1 Appendix).

Discussion
Our study shows that a GRS calculated using the number of risk alleles of the SNPs rs2255543 in HGD, rs5747933 in PRODH, rs6943999 in DLD, and rs1007160 in SLC7A9 was associated with HOMA-IR. Subjects with a high GRS had higher glucose, insulin, total cholesterol and triglyceride levels, and lower arginine levels, than subjects with a low GRS. Information regarding the potential causal relationship between these SNPs and IR remains limited. At this point, we can only speculate about their implications. Homogentisate 1,2-dioxygenase (HGD) is an enzyme involved in tyrosine metabolism that converts homogentisic acid (HGA) to malate and acetoacetate [27]. Mutations in the HGD gene [28] lead to an autosomal recessive disorder known as alkaptonuria, which is characterized by the absence of HGD, causing an accumulation of HGA [29]. Alkaptonuria is characterized by dark urine, bluish-black pigmentation in the connective tissue and arthritis [30,31]. Moreover, a decrease in HGD activity could potentially result in decreased malate levels, thereby impacting the tricarboxylic acid cycle, oxidative phosphorylation, as well as amino acid and glucose metabolism, as suggested for clear cell renal carcinoma [32]. However, to our knowledge, this is the first finding of a possible relationship between rs2255543 in HGD and IR. Further studies are needed to understand the effect of this SNP on the activity of HGD and its consequences for IR development.
Proline dehydrogenase (PRODH) participates in the first step of proline catabolism. A previous study found that rs5747933 in PRODH was associated with high serum proline concentrations [33], suggesting that this SNP may decrease PRODH activity. High proline concentrations are associated with a higher incidence of T2D in the Chinese [34] and Japanese [35] adult populations. Additionally, proline levels are positively correlated with IR in Mexican young adults [5]. The exact mechanism by which altered proline levels are related to T2D or IR remains unclear. However, some hypotheses are the following: a) The increase in proline might be related to pancreatic cell dysfunction. Prolonged proline exposure increased basal insulin secretion and decreased glucose-stimulated insulin secretion in both clonal INS1-E insulinoma cells and isolated rat islets [36,37]. b) Proline may function as a redox modulator. Both proline synthesis and catabolism are intricately involved in redox-active mechanisms. For instance, the catabolic activity of PRODH generates ATP and, when excessively active, leads to an elevation in reactive oxygen species (ROS) production [37]. Several studies have linked the production of ROS to IR [38–40]. c) The modulation of glutamate production can impact glucagon secretion. Proline oxidation results in glutamate production, which in turn induces glucagon release in pancreatic alpha cells [41,42]. Additionally, glutamate facilitates the conversion of pyruvate to alanine. Glucagon secretion stimulates hepatic gluconeogenesis, while the high availability of alanine serves as a gluconeogenic substrate, potentially amplifying this metabolic pathway [34].
The SNP rs6943999 is located in the promoter region of the DLD gene. Dihydrolipoamide dehydrogenase (DLD) is an enzyme that catalyzes the oxidation of NADH to NAD+ in the glycine cleavage system. Moreover, DLD is the E3 component of three multienzyme dehydrogenase complexes (the pyruvate, alpha-ketoglutarate, and BCKDH complexes). The BCKDH complex modulates BCAA catabolism. Subjects with IR and obesity have increased serum BCAA levels [43], which could be due either to a decrease in the expression of BCAA catabolic enzymes or to a decrease in their activity [44]. The increase in BCAAs may cause the activation of the mammalian target of rapamycin (mTOR) pathway, subsequently activating downstream kinases such as p70S6 ribosomal kinase (p70S6K or S6K). This kinase can phosphorylate insulin receptor substrate 1 (IRS-1) on serine/threonine residues, potentially leading to the suppression of insulin signaling [45,46]. Furthermore, DLD is also part of the pyruvate dehydrogenase complex, implying that a decrease in DLD expression might be associated with an accumulation of pyruvate, a gluconeogenic substrate, particularly during prolonged fasting. An excess of pyruvate would increase gluconeogenesis [47]. That said, further research is needed to determine whether the presence of rs6943999 affects DLD expression, altering BCAA and pyruvate homeostasis and, thus, IR.
SLC7A9 encodes a sodium-independent cationic amino acid transporter, which is primarily responsible for the uptake of certain amino acids, such as cystine, lysine, arginine and neutral amino acids [48]. However, to our knowledge, there are no studies reporting an association between rs1007160 and IR. Lower expression of amino acid transporters, including SLC7A9, has been observed in hepatocytes from mice with diet-induced obesity, and this decrease was associated with hepatic steatosis, hyperlipidemia, obesity, and IR [49,50]. As a non-synonymous SNP, rs1007160 could potentially impact the structure or function of the transporter [51], resulting in a lower uptake of amino acids such as arginine and potentially affecting regulatory mechanisms. For example, arginine is the main substrate for nitric oxide synthesis. Through this pathway, arginine modulates glucose and lipid oxidation, and insulin sensitivity [52]. In addition, arginine can also activate the mTOR signaling mechanism, promoting protein synthesis and cell growth [53].
Concerning the differences in amino acid levels observed between men and women, our results are consistent with previous reports demonstrating that men have higher histidine, methionine, tyrosine and BCAA concentrations than women [54–56]. A possible explanation for the lower concentration of BCAAs in women could be related to the high catabolism of BCAAs in adipocytes [57] and the higher amount of body fat present in women [58]. Moreover, methionine has been positively associated with adiposity; in fact, methionine restriction is thought to improve insulin sensitivity and increase weight loss in humans and mice [59,60]. Interestingly, BCAA levels were lower in women, regardless of the GRS tertile. However, the differences observed in serine and methionine, when classified by sex in subjects with low GRS, were no longer present in subjects with high GRS. This suggests that the presence of SNPs may have a sex-dependent effect on the catabolism of certain amino acids, influencing their plasma concentrations, which requires further research for elucidation.
Our study has several strengths. Firstly, it is one of the first studies to evaluate various SNPs in genes associated with amino acid metabolism and their relationship with IR risk in a population of young Mexican adults. Secondly, these results provide evidence of novel SNPs linked to IR, along with the identification of amino acids as potential biomarkers for cardiometabolic risk. Thirdly, the implementation of the GRS might facilitate the early identification of young subjects at increased risk of IR. This approach, involving the evaluation of diverse SNPs, could have a greater clinical impact than the assessment of a single SNP alone.
While our findings must be validated in an independent population and should include an evaluation of the effect of the nutritional conditions of the subjects on their plasma amino acid levels, subjects with higher GRS may benefit from preventive lifestyle interventions and/or pharmacological treatment to reduce obesity and prevent the development of IR. Moreover, another limitation of our study lies in its cross-sectional design, which precludes determining the causality of the results. Further research is required to evaluate whether these SNPs indeed have a causal relationship with the development of IR over time, and to determine how they impact amino acid concentrations. In addition, the study focused on a specific population of young adults, which limits the generalizability of the findings to other age groups or populations. While the statistical power of the sample size used exceeded 80%, which is considered acceptable, further research with diverse cohorts would be valuable to validate the observed associations.
In conclusion, we calculated a GRS using the number of risk alleles of the SNPs in the HGD, PRODH, DLD and SLC7A9 genes. Subjects with high GRS had higher levels of HOMA-IR, glucose, insulin, total cholesterol and triglycerides, and lower levels of arginine than subjects with low GRS. The application of a GRS based on variants in genes associated with amino acid metabolism may be useful for the early identification of subjects at increased risk of IR.

Fig 1. Insulin resistance, quantified by the HOMA-IR (homeostatic model assessment-insulin resistance), across groups stratified into tertiles according to the genetic risk score (GRS) derived from the best model in a total of 452 subjects. GRS-low = tertile 1 (cut-off point: 0.620); GRS-medium = tertile 2 (cut-off point: 0.742); GRS-high = tertile 3 (cut-off point: 0.836). The HOMA-IR values of subjects with a high GRS and medium GRS were significantly higher than in subjects with a low GRS. Data are shown as mean ± standard deviation. Differences are based on ANOVA adjusted for sex, age and BMI, followed by Bonferroni's multiple comparisons post-hoc test; groups with different letters are significantly different, where a > b. Differences were significant at p < 0.01. https://doi.org/10.1371/journal.pone.0299543.g001